Handwritten Script Identification: Fusion based Approaches
Abstract
Script identification is one of the preprocessing steps in document image processing tasks. Script identification in printed documents has received considerable attention, whereas script identification in handwritten documents has received far less attention from the document research community. Almost all existing work attempts to identify suitable features or classifiers for handwritten script identification. More recently, attention has turned to the fusion of features and/or classifiers. To the best of our knowledge, only a few studies of fusion strategies for script identification in handwritten documents have been reported. In this work, we present two models based on fusion strategies for handwritten script identification in trilingual documents (Kannada, Hindi and English). The first model uses feature-level fusion, in which Gabor, LBPV and wavelet features are fused; the second uses decision-level fusion, in which the decisions of nearest neighbor and support vector machine classifiers are combined. Experiments were carried out on a data set of three classes, each with 100 documents, and the results indicate that the fusion strategies perform better than conventional methods.
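The two fusion levels named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how features are normalized or which rule combines the classifier decisions, so the L2 normalization, the weighted sum rule, the function names and all numeric values below are assumptions chosen for illustration only.

```python
from math import sqrt


def l2_normalize(vec):
    """Scale a vector to unit length so no single descriptor dominates (assumed step)."""
    norm = sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def fuse_features(*descriptors):
    """Feature-level fusion: normalize each descriptor, then concatenate them."""
    fused = []
    for descriptor in descriptors:
        fused.extend(l2_normalize(descriptor))
    return fused


def fuse_decisions(scores_a, scores_b, weight_a=0.5):
    """Decision-level fusion with a weighted sum rule over per-class scores (assumed rule)."""
    labels = scores_a.keys() & scores_b.keys()
    combined = {
        label: weight_a * scores_a[label] + (1.0 - weight_a) * scores_b[label]
        for label in labels
    }
    return max(combined, key=combined.get), combined


# Made-up stand-ins for Gabor, LBPV and wavelet descriptors of one document.
gabor = [3.0, 4.0]
lbpv = [1.0, 0.0, 0.0]
wavelet = [0.0, 2.0]
fused = fuse_features(gabor, lbpv, wavelet)  # one 7-dimensional vector

# Made-up per-class scores from a nearest neighbor and an SVM classifier.
nn_scores = {"Kannada": 0.5, "Hindi": 0.3, "English": 0.2}
svm_scores = {"Kannada": 0.2, "Hindi": 0.6, "English": 0.2}
label, combined = fuse_decisions(nn_scores, svm_scores)
print(len(fused), label)
```

In the feature-level case a single classifier would be trained on the fused vector, while in the decision-level case each classifier is trained separately and only their outputs are combined.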
Similar resources
A new dataset of word-level offline handwritten numeral images from four official Indic scripts and its benchmarking using image transform fusion
Handwritten document image dataset development is one of the most tedious and time-consuming tasks in optical character recogniser (OCR) related experimental work. Special attention needs to be given to feasibility, realness, clarity etc. while collecting real-life data from different writers. Few efforts can be found in the literature for development of handwritten NIdb (numeral image ...
Handwritten Script Identification from a Bi-Script Document at Line Level using Gabor Filters
In a country like India, where a large number of scripts are in use, automatic identification of printed and handwritten script facilitates many important applications, including sorting of document images and searching online archives of document images. In this paper, a Gabor feature based approach is presented to identify different Indian scripts from handwritten document images. Eight popular In...
Word level Script Identification from Bangla and Devanagri Handwritten Texts mixed with Roman Script
India is a multi-lingual country where Roman script is often used alongside different Indic scripts in a text document. To develop a script specific handwritten Optical Character Recognition (OCR) system, it is therefore necessary to identify the scripts of handwritten text correctly. In this paper, we present a system, which automatically separates the scripts of handwritten words from a docum...
Indic Handwritten Script Identification using Offline-Online Multimodal Deep Network
In this paper, we propose a novel approach of word-level Indic script identification using only character-level data in training stage. The advantages of using character level data for training have been outlined in section I. Our method uses a multimodal deep network which takes both offline and online modality of the data as input in order to explore the information from both the modalities j...
Convolution Based Technique for Indic Script Identification from Handwritten Document Images
Determination of script type of document image is a complex real life problem for a multi-script country like India, where 23 official languages (including English) are present and 13 different scripts are used to write them. Including English and Roman those count become 23 and 13 respectively. The problem becomes more challenging when handwritten documents are considered. In this paper an app...
Publication date: 2013